16 research outputs found
Blending Generative Adversarial Image Synthesis with Rendering for Computer Graphics
Conventional computer graphics pipelines require detailed 3D models, meshes,
textures, and rendering engines to generate 2D images from 3D scenes. These
processes are labor-intensive. We introduce Hybrid Neural Computer Graphics
(HNCG) as an alternative. The contribution is a novel image formation strategy
to reduce the 3D model and texture complexity of computer graphics pipelines.
Our main idea is straightforward: Given a 3D scene, render only important
objects of interest and use generative adversarial processes for synthesizing
the rest of the image. To this end, we propose a novel image formation strategy
to form 2D semantic images from 3D scenery consisting of simple object models
without textures. These semantic images are then converted into photo-realistic
RGB images with a state-of-the-art conditional Generative Adversarial Network
(cGAN) based image synthesizer trained on real-world data. Meanwhile, objects
of interest are rendered with a physics-based graphics engine, giving us full
control over their appearance. Finally, the partially-rendered and
cGAN-synthesized images are
blended with a blending GAN. We show that the proposed framework outperforms
conventional rendering with ablation and comparison studies. Semantic retention
and Fr\'echet Inception Distance (FID) measurements were used as the main
performance metrics
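As a concrete illustration of the FID metric named above: FID fits a Gaussian to Inception-v3 activations of real and synthesized images and measures the Fréchet distance between the two Gaussians. The sketch below is a simplified, pure-Python version that assumes diagonal covariances (so the matrix square-root term reduces to a per-dimension sum) and uses toy feature vectors in place of Inception activations.

```python
# Sketch of the Frechet Inception Distance (FID) under a simplifying
# diagonal-covariance assumption. Real FID uses full covariance matrices of
# Inception-v3 activations; the toy tuples below stand in for those features.
from math import sqrt
from statistics import fmean, pvariance

def fid_diagonal(feats_a, feats_b):
    """FID between two feature sets, assuming diagonal covariances:

    d^2 = ||mu_a - mu_b||^2 + sum_i (s_ai + s_bi - 2*sqrt(s_ai * s_bi))
    """
    dims = len(feats_a[0])
    total = 0.0
    for i in range(dims):
        col_a = [f[i] for f in feats_a]
        col_b = [f[i] for f in feats_b]
        mu_a, mu_b = fmean(col_a), fmean(col_b)
        s_a, s_b = pvariance(col_a, mu_a), pvariance(col_b, mu_b)
        total += (mu_a - mu_b) ** 2 + s_a + s_b - 2.0 * sqrt(s_a * s_b)
    return total

# Identical feature distributions give FID 0; a mean shift raises it.
real = [(0.0, 0.0), (2.0, 2.0)]
fake = [(1.0, 1.0), (3.0, 3.0)]
print(fid_diagonal(real, real))  # 0.0
print(fid_diagonal(real, fake))  # 2.0
```

Lower FID indicates that synthesized images are statistically closer to real ones, which is why it serves as the main image-quality metric here.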
Vision Language Models in Autonomous Driving and Intelligent Transportation Systems
The applications of Vision-Language Models (VLMs) in the fields of Autonomous
Driving (AD) and Intelligent Transportation Systems (ITS) have attracted
widespread attention due to their outstanding performance and the ability to
leverage Large Language Models (LLMs). By integrating language data, vehicles
and transportation systems can deeply understand real-world environments,
improving driving safety and efficiency. In this work, we present
a comprehensive survey of the advances in language models in this domain,
encompassing current models and datasets. Additionally, we explore the
potential applications and emerging research directions. Finally, we thoroughly
discuss the challenges and research gaps. This paper aims to provide
researchers with an overview of current work and future trends of VLMs in AD
and ITS.
3D Understanding of Deformable Linear Objects: Datasets and Transferability Benchmark
Deformable linear objects are ubiquitous in our everyday lives. It is
often challenging even for humans to visually understand them, as the same
object can be entangled so that it appears completely different. Examples of
deformable linear objects include blood vessels and wiring harnesses, vital to
the functioning of their corresponding systems, such as the human body and a
vehicle. However, no point cloud datasets exist for studying 3D deformable
linear objects. Therefore, we are introducing two point cloud datasets,
PointWire and PointVessel. We evaluated state-of-the-art methods on the
proposed large-scale 3D deformable linear object benchmarks. Finally, we
analyzed the generalization capabilities of these methods by conducting
transferability experiments on the PointWire and PointVessel datasets.
Risky Action Recognition in Lane Change Video Clips using Deep Spatiotemporal Networks with Segmentation Mask Transfer
Advanced driver assistance and automated driving systems rely on risk
estimation modules to predict and avoid dangerous situations. Current methods
use expensive sensor setups and complex processing pipelines, limiting their
availability and robustness. To address these issues, we introduce a novel deep
learning based action recognition framework for classifying dangerous lane
change behavior in short video clips captured by a monocular camera. We
designed a deep spatiotemporal classification network that uses pre-trained
state-of-the-art instance segmentation network Mask R-CNN as its spatial
feature extractor for this task. The Long Short-Term Memory (LSTM) and
shallower final classification layers of the proposed method were trained on a
semi-naturalistic lane change dataset with annotated risk labels. A
comprehensive comparison of state-of-the-art feature extractors was carried out
to find the best network layout and training strategy. The best result, with a
0.937 AUC score, was obtained with the proposed network. Our code and trained
models are available open-source. Comment: 8 pages, 3 figures, 1 table.
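To illustrate the temporal half of the architecture described above (per-frame spatial features fed into an LSTM and a shallow classifier), here is a minimal pure-Python sketch. The scalar per-frame values stand in for Mask R-CNN feature vectors, and all weights are toy values chosen for the example, not the trained model.

```python
# Sketch: per-frame features (stand-ins for Mask R-CNN spatial features) are
# aggregated by a single scalar LSTM cell; the final hidden state feeds a
# logistic risk classifier. Weights here are illustrative, not learned.
from math import exp, tanh

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def lstm_step(x, h, c, w):
    """One LSTM step for scalar input/state, with gates i, f, o and candidate g."""
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])  # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])  # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])  # output gate
    g = tanh(w["wg"] * x + w["ug"] * h + w["bg"])     # candidate cell state
    c = f * c + i * g
    h = o * tanh(c)
    return h, c

def classify_clip(frame_feats, w, w_out=2.0, b_out=-1.0):
    """Run the LSTM over a clip's frame features; return a risk probability."""
    h = c = 0.0
    for x in frame_feats:
        h, c = lstm_step(x, h, c, w)
    return sigmoid(w_out * h + b_out)

# Toy weights: every gate half-open (zero pre-activation), candidate tracks x.
w = {k: 0.0 for k in ("wi","ui","bi","wf","uf","bf","wo","uo","bo","ug","bg")}
w["wg"] = 1.0
calm  = classify_clip([0.0, 0.0, 0.0], w)  # low per-frame activation
risky = classify_clip([2.0, 2.0, 2.0], w)  # high per-frame activation
print(calm < risky)  # True
```

In the actual pipeline the scalar input would be a high-dimensional feature vector per frame and the LSTM would be a learned multi-unit layer, but the control flow (extract per frame, aggregate over time, classify once per clip) is the same.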